A Reduction from Apprenticeship Learning to Classification
Authors
Abstract
We provide new theoretical results for apprenticeship learning, a variant of reinforcement learning in which the true reward function is unknown and the goal is to perform well relative to an observed expert. We study a common approach to learning from expert demonstrations: using a classification algorithm to learn to imitate the expert’s behavior. Although this straightforward learning strategy is widely used in practice, it has been subject to very little formal analysis. We prove that, if the learned classifier has error rate ε, the difference between the value of the apprentice’s policy and the expert’s policy is O(√ε). Further, we prove that this difference is only O(ε) when the expert’s policy is close to optimal. This latter result has an important practical consequence: not only does imitating a near-optimal expert result in a better policy, but far fewer demonstrations are required to successfully imitate such an expert. This suggests an opportunity for substantial savings whenever the expert is known to be good, but demonstrations are expensive or difficult to obtain.
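The classification-based approach described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy chain MDP, the always-move-right expert, the tabular majority-vote classifier, and all function names are hypothetical. The sketch treats the expert's (state, action) pairs as labeled examples, fits a classifier to them, and compares the value of the resulting apprentice policy with the expert's.

```python
from collections import Counter, defaultdict

# Hypothetical toy MDP (not from the paper): a chain of states with two actions.
N_STATES = 5
HORIZON = 10
LEFT, RIGHT = 0, 1

def step(state, action):
    """Deterministic chain dynamics: move left or right, clipped to the ends."""
    return max(0, min(N_STATES - 1, state + (1 if action == RIGHT else -1)))

def reward(state):
    """Reward used only for evaluation; the apprentice never observes it."""
    return 1.0 if state == N_STATES - 1 else 0.0

def expert_policy(state):
    """Assumed expert: always move right, which is optimal in this toy MDP."""
    return RIGHT

def rollout(policy, start=0):
    """Run a policy for HORIZON steps; return (state, action) pairs and total reward."""
    state, pairs, total = start, [], 0.0
    for _ in range(HORIZON):
        action = policy(state)
        pairs.append((state, action))
        state = step(state, action)
        total += reward(state)
    return pairs, total

def fit_classifier(demos):
    """Tabular majority-vote classifier: state -> most frequently demonstrated action."""
    votes = defaultdict(Counter)
    for state, action in demos:
        votes[state][action] += 1
    table = {s: c.most_common(1)[0][0] for s, c in votes.items()}
    # States never seen in the demonstrations fall back to LEFT (an arbitrary choice).
    return lambda s: table.get(s, LEFT)

# Full demonstration: the classifier sees every state the expert visits.
demos, expert_value = rollout(expert_policy)
apprentice = fit_classifier(demos)
_, apprentice_value = rollout(apprentice)
train_error = sum(apprentice(s) != a for s, a in demos) / len(demos)

# Truncated demonstration: only the first three (state, action) pairs are available.
short_apprentice = fit_classifier(demos[:3])
_, short_value = rollout(short_apprentice)

print(expert_value, apprentice_value, train_error, short_value)
```

With the full demonstration the apprentice reproduces the expert exactly in this toy setting. With the truncated demonstration it reaches a state it has no label for, falls back to the default action, and never collects reward, which shows informally how classification mistakes translate into lost value.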
Similar resources
Comparison of Students’ Perception of Preparedness for Interprofessional learning readiness in apprenticeship and apprenticeship on site in Schools of Nursing and Midwifery of Islamic Azad Universities in Isfahan, Iran in 2018
Background & Objective: Interprofessional education (IPE) is one of the new approaches in the education of students in health-related disciplines. This type of training can increase interprofessional collaborations, thereby improving patient care quality. This study aimed to compare the perception of IPE in students apprenticeship and apprenticeship on site in schools of nursing and midwifery o...
Knowledge Base Refinement Using Apprenticeship Learning Techniques
This paper describes how apprenticeship learning techniques can be used to refine the knowledge base of an expert system for heuristic classification problems. The described method is an alternative to the long-standing practice of creating such knowledge bases via induction from examples. The form of apprenticeship learning discussed in this paper is a form of learning by watching, in which le...
Boosted and reward-regularized classification for apprenticeship learning
This paper deals with the problem of learning from demonstrations, where an agent called the apprentice tries to learn a behavior from demonstrations of another agent called the expert. To address this problem, we place ourselves into the Markov Decision Process (MDP) framework, which is well suited for sequential decision making problems. A way to tackle this problem is to reduce it to classif...
IRDDS: Instance reduction based on Distance-based decision surface
In instance-based learning, a training set is given to a classifier for classifying new instances. In practice, not all information in the training set is useful for classifiers. Therefore, it is convenient to discard irrelevant instances from the training set. This process is known as instance reduction, which is an important task for classifiers since through this process the time for classif...
Apprenticeship Learning: Transfer of Knowledge via Dataset Augmentation
In visual category recognition there is often a trade-off between fast and powerful classifiers. Complex models often have superior performance to simple ones but are computationally too expensive for many applications. At the same time the performance of simple classifiers is not necessarily limited only by their flexibility but also by the amount of labelled data available for training. We pr...